Results 1 - 2 of 2
1.
J Biomed Inform ; 118: 103779, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33839304

ABSTRACT

The automatic recognition of gene names and their corresponding database identifiers in biomedical text is an important first step for many downstream text-mining applications. While current methods for tagging gene entities have been developed for biomedical literature, their performance on species other than human is substantially lower due to the lack of annotation data. We therefore present the NLM-Gene corpus, a high-quality manually annotated corpus for genes developed at the US National Library of Medicine (NLM), covering ambiguous gene names, with an average of 29 gene mentions (10 unique identifiers) per document, and a broader representation of different species (including Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Arabidopsis thaliana, Danio rerio, etc.) when compared to previous gene annotation corpora. NLM-Gene consists of 550 PubMed abstracts from 156 biomedical journals, doubly annotated by six experienced NLM indexers, randomly paired for each document to control for bias. The annotators worked in three annotation rounds until they reached complete agreement. This gold-standard corpus can serve as a benchmark to develop and test new gene text-mining algorithms. Using this new resource, we have developed a new gene-finding algorithm based on deep learning that improves on both the precision and the recall of existing tools. The NLM-Gene annotated corpus is freely available at ftp://ftp.ncbi.nlm.nih.gov/pub/lu/NLMGene. We have also applied this tool to all of PubMed/PMC, and the results are freely accessible through our web-based tool PubTator (www.ncbi.nlm.nih.gov/research/pubtator).


Subjects
Drosophila melanogaster, Genes vif, Animals, Data Mining, National Library of Medicine (U.S.), PubMed, Rats, United States
2.
Mol Microbiol ; 52(4): 1215-23, 2004 May.
Article in English | MEDLINE | ID: mdl-15130136

ABSTRACT

Bacterial plasmids of low copy number, P1 prophage among them, are actively partitioned to nascent daughter cells. The process is typically mediated by a pair of plasmid-encoded proteins and a cis-acting DNA site or cluster of sites, referred to as the plasmid centromere. P1 ParB protein, which binds to the P1 centromere (parS), can spread for several kilobases along flanking DNA. We argue that studies of mutant ParB that demonstrated a strong correlation between spreading capacity and the ability to engage in partitioning may be misleading, and describe here a critical test of the dependence of partitioning on the spreading of the wild-type protein. Physical constraints imposed on the spreading of P1 ParB were found to have only a minor, but reproducible, effect on partitioning. We conclude that, whereas extensive ParB spreading is not required for partitioning, spreading may have an auxiliary role in the process.


Subjects
Bacteriophage P1/genetics, Bacteriophage P1/physiology, Plasmids/physiology, Viral Proteins/metabolism, Bacterial Proteins/metabolism, Bacteriophage P1/metabolism, Cell Division, DNA-Binding Proteins/metabolism